(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1457 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ix) FEATURE: 

(A) NAME/KEY: CDS

(B) LOCATION: 115..1326

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

TGTTGAGTCA ACAGCTGTAG ATACACCAAT TGTTGCCGAT TTCTTTCTTT TCGACTGTCG 60

GCTTCTCGCG AAACTGTGAT TGTGAAAATT GTACAAATAG AGGCAAATTT AACC ATG 117
Met
1

GCG CAC ATG TCC CAC ATG CTC CAG CAG CCT TCG GGG TCG ACG CCC TCC 165
Ala His Met Ser His Met Leu Gln Gln Pro Ser Gly Ser Thr Pro Ser
5 10 15

AAC GTG GGC TCC AGC TCA TCG CGC ACG ATG TCC CTG ATG GAG AAA CAA 213
Asn Val Gly Ser Ser Ser Ser Arg Thr Met Ser Leu Met Glu Lys Gln
20 25 30

AAG TAC ATC GAG GAC TAC GAC TTT CCC TAC TGC GAC GAG AGC AAC AAA 261
Lys Tyr Ile Glu Asp Tyr Asp Phe Pro Tyr Cys Asp Glu Ser Asn Lys
35 40 45

TAC GAA AAG GTG GCG AAA ATT GGC CAA GGC ACC TTC GGA GAG GTT TTT 309
Tyr Glu Lys Val Ala Lys Ile Gly Gln Gly Thr Phe Gly Glu Val Phe
50 55 60 65

AAG GCT CGC GAG AAA AAG GGC AAC AAG AAG TTT GTG GCC ATG AAG AAG 357
Lys Ala Arg Glu Lys Lys Gly Asn Lys Lys Phe Val Ala Met Lys Lys
70 75 80

GTG CTG ATG GAC AAC GAA AAG GAG GGC TTT CCC ATC ACG GCT CTG CGA 405
Val Leu Met Asp Asn Glu Lys Glu Gly Phe Pro Ile Thr Ala Leu Arg
85 90 95

GAG ATC CGC ATC CTG CAG CTG CTA AAG CAC GAG AAC GTG GTG AAT CTG 453
Glu Ile Arg Ile Leu Gln Leu Leu Lys His Glu Asn Val Val Asn Leu
100 105 110

ATC GAG ATC TGC CGC ACC AAG GCC ACC GCC ACG AAT GGT TAC AGA TCC 501
Ile Glu Ile Cys Arg Thr Lys Ala Thr Ala Thr Asn Gly Tyr Arg Ser
115 120 125

ACC TTC TAT TTG GTC TTT GAT TTC TGC GAA CAC GAT TTG GCA GGT CTT 549
Thr Phe Tyr Leu Val Phe Asp Phe Cys Glu His Asp Leu Ala Gly Leu
130 135 140 145

CTG TCC AAC ATG AAC GTC AAG TTC AGT CTG GGC GAG ATT AAG AAG GTT 597
Leu Ser Asn Met Asn Val Lys Phe Ser Leu Gly Glu Ile Lys Lys Val
150 155 160

ATG CAG CAG CTT TTA AAC GGT TTG TAT TAC ATC CAC AGC AAC AAG ATC 645
Met Gln Gln Leu Leu Asn Gly Leu Tyr Tyr Ile His Ser Asn Lys Ile
165 170 175

CTG CAC CGA GAC ATG AAA GCT GCC AAC GTG CTG ATT ACC AAG CAT GGC 693
Leu His Arg Asp Met Lys Ala Ala Asn Val Leu Ile Thr Lys His Gly
180 185 190

ATC TTA AAG CTG GCT GAC TTT GGC TTG GCC CGT GCT TTT AGC ATT CCA 741
Ile Leu Lys Leu Ala Asp Phe Gly Leu Ala Arg Ala Phe Ser Ile Pro
195 200 205

AAG AAC GAG AGT AAG AAT CGC TAT ACC AAT CGC GTA GTA ACC TTG TGG 789
Lys Asn Glu Ser Lys Asn Arg Tyr Thr Asn Arg Val Val Thr Leu Trp
210 215 220 225

TAC CGG CCG CCT GAG CTG CTA CTT GGT GAC CGC AAC TAT GGT CCA CCC 837
Tyr Arg Pro Pro Glu Leu Leu Leu Gly Asp Arg Asn Tyr Gly Pro Pro
230 235 240



GTG GAC ATG TGG GGA GCC GGC TGC ATA ATG GCC GAG ATG TGG ACA CGC 885
Val Asp Met Trp Gly Ala Gly Cys Ile Met Ala Glu Met Trp Thr Arg
245 250 255

TCG CCC ATC ATG CAA GGC AAT ACG GAG CAG CAG CAG TTA ACC TTT ATT 933
Ser Pro Ile Met Gln Gly Asn Thr Glu Gln Gln Gln Leu Thr Phe Ile
260 265 270

TCG CAG CTA TGC GGC TCC TTT ACG CCG GAC GTG TGG CCG GGA GTG GAG 981
Ser Gln Leu Cys Gly Ser Phe Thr Pro Asp Val Trp Pro Gly Val Glu
275 280 285

GAG CTG GAG CTG TAC AAA TCC ATC GAG CTG CCA AAG AAC CAG AAG CGT 1029
Glu Leu Glu Leu Tyr Lys Ser Ile Glu Leu Pro Lys Asn Gln Lys Arg
290 295 300 305

CGA GTC AAG GAG CGC CTG CGT CCG TAT GTC AAG GAT CAA ACC GGC TGT 1077
Arg Val Lys Glu Arg Leu Arg Pro Tyr Val Lys Asp Gln Thr Gly Cys
310 315 320

GAT CTA TTG GAC AAA TTG CTG ACC CTT GAT CCC AAG AAA CGC ATC GAT 1125
Asp Leu Leu Asp Lys Leu Leu Thr Leu Asp Pro Lys Lys Arg Ile Asp
325 330 335

GCG GAC ACA GCT CTG AAT CAC GAC TTC TTC TGG ACG GAT CCC ATG CCC 1173
Ala Asp Thr Ala Leu Asn His Asp Phe Phe Trp Thr Asp Pro Met Pro
340 345 350

AGC GAC TTG AGC AAG ATG CTG TCC CAG CAC CTG CAG AGC ATG TTC GAG 1221
Ser Asp Leu Ser Lys Met Leu Ser Gln His Leu Gln Ser Met Phe Glu
355 360 365

TAC CTG GCG CAG CCA CGC CGC AGC AAC CAG ATG CGC AAC TAT CAC CAG 1269
Tyr Leu Ala Gln Pro Arg Arg Ser Asn Gln Met Arg Asn Tyr His Gln
370 375 380 385

CAA CTG ACC ACC ATG AAC CAG AAG CCC CAG GAC AAC AGT ATG ATT GAC 1317
Gln Leu Thr Thr Met Asn Gln Lys Pro Gln Asp Asn Ser Met Ile Asp
390 395 400

CGG GTT TGG TAGACTGCCA GAGGTGTACG CACCCGACTA ATAGTTTCTC 1366
Arg Val Trp

ACCTTCAACT AGCGTTAGGT TATTAGGTTA GTGTACAATA AAAATATTGG CATTTGCATT 1426
AGCGCTTGCT CCAAATATAA AAAAAAAAAA A 1457
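
The CDS feature for SEQ ID NO: 1 (location 115..1326) can be checked against the stated protein length with a few lines of Python. The sketch below is illustrative only: the names `codon_count` and `translate` are hypothetical helpers, and the codon table is a deliberately small subset of the standard genetic code, covering only the codons used in the demonstration.

```python
# Subset of the standard genetic code; only the codons needed for the
# demonstration are listed (a complete table has 64 entries).
CODONS = {"ATG": "Met", "GCG": "Ala", "CAC": "His", "TCC": "Ser"}


def codon_count(start: int, end: int) -> int:
    """Number of complete codons in a 1-based, inclusive CDS span."""
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError(f"CDS length {length} is not a multiple of 3")
    return length // 3


def translate(dna: str) -> list[str]:
    """Translate a DNA string (spaces ignored) codon by codon."""
    dna = dna.replace(" ", "").upper()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return [CODONS[dna[i:i + 3]] for i in range(0, usable, 3)]


# CDS span taken from the FEATURE block of SEQ ID NO: 1.
n_codons = codon_count(115, 1326)

# First five codons of the coding region as listed above.
first_five = translate("ATG GCG CAC ATG TCC")

print(n_codons, first_five)
```

The codon count for 115..1326 agrees with the 404 amino acids given for SEQ ID NO: 2, and the same helper applied to the span 1..1116 of SEQ ID NO: 5 gives the 372 residues of SEQ ID NO: 6.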



(2) INFORMATION FOR SEQ ID NO: 2: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 404 amino acids



(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 

Met Ala His Met Ser His Met Leu Gln Gln Pro Ser Gly Ser Thr Pro
1 5 10 15

Ser Asn Val Gly Ser Ser Ser Ser Arg Thr Met Ser Leu Met Glu Lys
20 25 30

Gln Lys Tyr Ile Glu Asp Tyr Asp Phe Pro Tyr Cys Asp Glu Ser Asn
35 40 45

Lys Tyr Glu Lys Val Ala Lys Ile Gly Gln Gly Thr Phe Gly Glu Val
50 55 60

Phe Lys Ala Arg Glu Lys Lys Gly Asn Lys Lys Phe Val Ala Met Lys
65 70 75 80

Lys Val Leu Met Asp Asn Glu Lys Glu Gly Phe Pro Ile Thr Ala Leu
85 90 95

Arg Glu Ile Arg Ile Leu Gln Leu Leu Lys His Glu Asn Val Val Asn
100 105 110

Leu Ile Glu Ile Cys Arg Thr Lys Ala Thr Ala Thr Asn Gly Tyr Arg
115 120 125

Ser Thr Phe Tyr Leu Val Phe Asp Phe Cys Glu His Asp Leu Ala Gly
130 135 140

Leu Leu Ser Asn Met Asn Val Lys Phe Ser Leu Gly Glu Ile Lys Lys
145 150 155 160

Val Met Gln Gln Leu Leu Asn Gly Leu Tyr Tyr Ile His Ser Asn Lys
165 170 175

Ile Leu His Arg Asp Met Lys Ala Ala Asn Val Leu Ile Thr Lys His
180 185 190

Gly Ile Leu Lys Leu Ala Asp Phe Gly Leu Ala Arg Ala Phe Ser Ile
195 200 205

Pro Lys Asn Glu Ser Lys Asn Arg Tyr Thr Asn Arg Val Val Thr Leu
210 215 220

Trp Tyr Arg Pro Pro Glu Leu Leu Leu Gly Asp Arg Asn Tyr Gly Pro
225 230 235 240

Pro Val Asp Met Trp Gly Ala Gly Cys Ile Met Ala Glu Met Trp Thr
245 250 255

Arg Ser Pro Ile Met Gln Gly Asn Thr Glu Gln Gln Gln Leu Thr Phe
260 265 270

Ile Ser Gln Leu Cys Gly Ser Phe Thr Pro Asp Val Trp Pro Gly Val
275 280 285

Glu Glu Leu Glu Leu Tyr Lys Ser Ile Glu Leu Pro Lys Asn Gln Lys
290 295 300

Arg Arg Val Lys Glu Arg Leu Arg Pro Tyr Val Lys Asp Gln Thr Gly
305 310 315 320

Cys Asp Leu Leu Asp Lys Leu Leu Thr Leu Asp Pro Lys Lys Arg Ile
325 330 335

Asp Ala Asp Thr Ala Leu Asn His Asp Phe Phe Trp Thr Asp Pro Met
340 345 350

Pro Ser Asp Leu Ser Lys Met Leu Ser Gln His Leu Gln Ser Met Phe
355 360 365

Glu Tyr Leu Ala Gln Pro Arg Arg Ser Asn Gln Met Arg Asn Tyr His
370 375 380

Gln Gln Leu Thr Thr Met Asn Gln Lys Pro Gln Asp Asn Ser Met Ile
385 390 395 400

Asp Arg Val Trp



(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4328 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 

CAGCCCTGCC GACGGCCATA CTTGAAAATA CATTTTTTTC TGCAAAGTTT GTCATTGTCA 60 

CTGTGTGAAT GGAATCTGTG ATGTGTTGTG GAATTAAAAA CGTCAAGTAA ACAACCCGTA 120 

ATGGTTAAAG TGCACGGCGA AAGCAGTGCG AATAACTATG AATTGATACA AAAGTTGCAT 180 

AACACGTCGC CTGGTGTCGC GGTTAGTGTG TTTTTCGTCT CGTTTCGTTT CCGCCGCAGT 240

CGCAGTTTCC AAAAAACCTC ACCACACCAT ACCATCTCCA CCACGCACAC ACACACACAA 300 

ACAAACACGC AGAGACGCGG CGGCGGAAAA AGTGTGCGGA CCGCGGATTT AACCCCTCGT 360 

TCCAAACCCA AATTGGAGTC TCCCAAAAAC AGCGAAATAT CGAGTGTGGC TTAGCCGATG 420

TGCCGTGCGA TCCCCACTGC CCCTTCCGTA CCGCTGCCAC CCCCGCCACA GCAGCAACGC 480 

ACACGGATAC GGACACAGAC ACCAATACCA GCGCACTCAA GCACGGCCGA CAAAGAAAGA 540



GCGCTCTCCC TTCCTCTTTG TACAGTTAGT TCCTACAGCT GAATCAGCCA AAAGAAATTA 600 
CTAGGTCCAT TCCGAGGCGC AGTTTGCATG TGAAACGGAG GTCCCCGCAT AACCACGCGG 660 
AACCCGAAAT TCCAGATCCC CATCTCCGCT GCACGGATAA AGGAAACATA CAACCATGAG 720 
TCTCCTAGCC ACGCCAATGC CCCAGGCGGC CACCGCCTCA TCTTCTTCAT CCGCCTCCGC 780 
GGCCGCCTCG GCCAGCGGGA TTCCAATCAC CGCCAACAAC AACCTGCCTT TCGAGAAGGA 840
CAAGATCTGG TACTTCAGCA ACGATCAGCT GGCCAATTTG CCAAGCAGAA GATGCGGCAT 900 
CAAGGGCGAC GATGAGCTGC AGTACCGCCA GATGACCGCC TATCTGATAC AGGAAATGGG 960

TCAGCGTCTG CAGGTGTCCC AACTGTGCAT CAACACGGCC ATTGTGTACA TGCATCGGTT 1020 

CTACGCCTTT CACTCCTTCA CCCACTTTCA TCGCAACTCC ATGGCGTCGG CGAGCCTCTT 1080 

CTTGGCCGCC AAGGTAGAAG AGCAACCGCG GAAGCTGGAG CATGTTATTC GGGCCGCCAA 1140 

CAAGTGCCTG CCGCCGACCA CCGAGCAGAA TTACGCCGAA CTCGCCCAGG AGCTTGTGTT 1200 

CAACGAGAAC GTGCTCCTGC AGACGCTGGG CTTCGATGTG GCCATCGATC ATCCGCACAC 1260 

GCATGTGGTG CGCACCTGCC AGCTGGTCAA AGCATGCAAG GATCTGGCGC AGACATCGTA 1320 

CTTCTTGGCC TCGAACAGCC TGCATCTGAC CTCGATGTGC CTCCAATATC GCCCCACGGT 1380 

CGTAGCCTGT TTCTGCATTT ACCTAGCCTG CAAGTGGTCC CGATGGGAGA TCCCCCAGTC 1440

GACCGAGGGC AAGCACTGGT TCTACTATGT GGACAAGACG GTCTCGCTGG ATTTGCTAAA 1500 

GCAGCTGACA GATGAGTTCA TCGCTATCTA TGAGAAGAGC CCGGCCCGTC TGAAGTCTAA 1560 

GCTTAACTCG ATCAAGGCGA TCGCCCAGGG AGCCAGCAAT CGGACAGCTA ACAGCAAGGA 1620 

CAAACCAAAG GAGGACTGGA AGATCACCGA GATGATGAAG GGCTACCACT CAAACATCAC 1680 

GACACCACCA GAGCTGTTAA ACGGCAACGA CAGCCGGGAT CGGGACCGAG ATCGTGAACG 1740

GGAGAGAGAG CGGGAACGGG ATCCGTCGTC ACTACTGCCG CCACCGGCTA TGGTGCCGCA 1800

GCAAAGACGA CAGGATGGTG GACATCAGCG CTCGTCCTCA GTGAGCGGAG TGCCAGGCAG 1860

CAGCTCTTCG TCGTCTTCCT CCAGTCACAA GATGCCAAAT TACCCTGGTG GCATGCCGCC 1920 

CGAAGCTCAT CCGGATCACA AGTCAAAGCA GCCGGGCTAT AACAATCGAA TGCCCTCAAG 1980 

TCACCAGCGT AGTAGTAGCA GTGGACTCGG TTCCTCGGGA AGTGGCAGCC AGCACAGCAG 2040

CTCATCCTCG TCGTCTTCAA GCCAGCAGCC TGGCCGACCG TCTATGCCCG TGGACTATCA 2100 

CAAATCCTCT CGCGGCATGC CGCCGGTAGG CGTGGGCATG CCACCTCACG GCAGCCACAA 2160 

GATGACTTCG GGCTCCAAGC CTCAACAGCC GCAGCAGCAG CCGGTCCCAC ATCCATCCGC 2220 



CTCTAATTCC TCTGCATCGG GCATGTCCTC CAAGGATAAA TCCCAGAGCA ACAAAATGTA 2280 

TCCGAACGCA CCGCCGCCAT ACAGTAATAG TGCCCCTCAA AACCCGCTGA TGTCGCGTGG 2340

TGGATATCCA GGCGCTAGCA ATGGATCCCA GCCCCCGCCT CCCGCCGGAT ACGGCGGCCA 2400 

TCGCAGCAAA TCCGGCTCCA CCGTCCATGG CATGCCGCAT TTCGAGCAGC AATTGCCCTA 2460

TTCCCAGAGC CAGAGCTACG GCCACATGCA GCAGCAGCCA GTGCCTCAGT CTCAGCAGCA 2520 

ACAGATGCCT CCGGAGGCAT CCCAGCACTC GTTGCAGTCC AAGAACTCGC TCTTCAGTCC 2580 

AGAGTGGCCA GACATTAAAA AGGAGCCCAT GTCGCAGTCG CAACCACAGC TTTTTAACGG 2640

TTTGCTACCC CCTCCTGCGC CTCCCGGCCA CGATTACAAG CTAAATAGCC ATCCGCGCGA 2700 

CAAAGAAAGT CCCAAGAAAG AGCGACTAAC GCCAACCAAA AAGGATAAGC ACCGTCCTGT 2760

AATGCCCCCA ATGGGCAGTG GGAACAGTTC CTCCGGCTCG GGATCATCAA AGCCGATGCT 2820 

ACCGCCTCAC AAGAAGCAGA TACCCCATGG CGGGGACCTG TTGACCAATC CTGGAGAGAG 2880 

TGGAAGCCTA AAACGGCCCA ACGAGATCTC GGGAAGTCAG TATGGACTAA ATAAGCTGGA 2940 

TGAAATAGAT AACAGTAATA TGCCTCGAGA AAAGCTTCGC AAGCTGGACA CTACAACTGG 3000 

ACTACCAACT TATCCGAATT ATGAGGAGAA ACACACGCCT CTGAATATGT CCAACGGAAT 3060 

CGAGACAACG CCGGATCTGG TGCGCAGTTT GCTAAAGGAG AGTCTGTGTC CATCGAACGC 3120 

TTCGCTCCTG AAACCGGATG CCTTGACTAT GCCTGGCCTG AAACCACCGG CCGAACTACT 3180 

TGAGCCCATG CCCGCACCAG CGACAATCAA GAAAGAACAG GGAATAACTC CGATGACCAG 3240 

TTTGGCTAGT GGGCCCGCAC CCATGGATTT GGAAGTACCC ACTAAACAGG CCGGAGAGAT 3300 

TAAGGAGGAA AGCAGCAGCA AGTCCGAAAA GAAAAAGAAG AAGGATAAAC ACAAACACAA 3360 

GGAGAAGGAC AAGTCCAAGG ACAAGACGGA AAAGGAGGAG CGTAAGAAGC ACAAGAGGGA 3420 

CAAGCAGAAG GATCGTAGCG GCAGCGGTGG CAGCAAGGAC AGTTCTCTTC CCAATGAGCC 3480 

TCTGAAGATG GTTATCAAGA ATCCCAACGG CAGCCTGCAG GCCGGTGCGT CAGCTCCCAT 3540

TAAACTTAAG ATCAGCAAAA ATAAGGTTGA ACCCAATAAC TACTCTGCAG CGGCGGGTCT 3600 

GCCTGGCGCA ATCGGATATG GCTTGCCTCC AACTACGGCT ACCACCACAT CCGCTTCGAT 3660 

CGGAGCAGCT GCTCCTGTTC TGCCTCCTTA TGGTGCCGGC GGTGGTGGCT ACAGCTCATC 3720 

GGGCGGCAGC AGTTCCGGTG GCAGCAGCAA GAAAAAGCAC AGCGATCGTG ACCGCGACAA 3780 

GGAGAGCAAA AAGAATAAGA GCCAAGACTA CGCGAAGTAC AATGGCGCTG GTGGCGGCAT 3840

CTTTAATCCC CTTGGCGGTG CTGGCGCCGC ACCCAATATG TCTGGAGGAA TGGGCGCCCC 3900 



CATGTCTACT GCTGTACCAC CATCCATGCT GTTGGCGCCC ACCGGTGCAG TACCACCCTC 3960 

TGCCGCTGGG CTGGCACCGC CTCCCATGCC CGTCTACAAC AAGAAGTAGT GGTAGCGGTC 4020

AGAGGGTTAT TCTTAAGTCG TACGTTTTGA TATATGTATA GAACCTCAGT AAGTCCGATT 4080

GTAGTATAGT TGTTAGGATT GTTAGTGAGA TGCATTATTG ATTTTAGTTA AGCACATAGA 4140

TAAAACTCCA AATTGGAAGT GAAACCGGAT GCGCAGATCG AAGAAGAATG GAAGTAGATG 4200 

TCGCGATGGG GCTGGACGTA AAAGCAGTAC TCAAATCGCG AAAACTTTTG TACAGCATTA 4260 

ATTAGTTTAT AACTATAATA AATAGCATAC ATATAAGCCC AAAAAAAAAA AAAAAAAAAA 4320 

AAAAAAAA 4328 

(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1097 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 

Met Ser Leu Leu Ala Thr Pro Met Pro Gln Ala Ala Thr Ala Ser Ser
1 5 10 15

Ser Ser Ser Ala Ser Ala Ala Ala Ser Ala Ser Gly Ile Pro Ile Thr
20 25 30

Ala Asn Asn Asn Leu Pro Phe Glu Lys Asp Lys Ile Trp Tyr Phe Ser
35 40 45

Asn Asp Gln Leu Ala Asn Leu Pro Ser Arg Arg Cys Gly Ile Lys Gly
50 55 60

Asp Asp Glu Leu Gln Tyr Arg Gln Met Thr Ala Tyr Leu Ile Gln Glu
65 70 75 80

Met Gly Gln Arg Leu Gln Val Ser Gln Leu Cys Ile Asn Thr Ala Ile
85 90 95

Val Tyr Met His Arg Phe Tyr Ala Phe His Ser Phe Thr His Phe His
100 105 110

Arg Asn Ser Met Ala Ser Ala Ser Leu Phe Leu Ala Ala Lys Val Glu
115 120 125

Glu Gln Pro Arg Lys Leu Glu His Val Ile Arg Ala Ala Asn Lys Cys
130 135 140

Leu Pro Pro Thr Thr Glu Gln Asn Tyr Ala Glu Leu Ala Gln Glu Leu
145 150 155 160

Val Phe Asn Glu Asn Val Leu Leu Gln Thr Leu Gly Phe Asp Val Ala
165 170 175

Ile Asp His Pro His Thr His Val Val Arg Thr Cys Gln Leu Val Lys
180 185 190

Ala Cys Lys Asp Leu Ala Gln Thr Ser Tyr Phe Leu Ala Ser Asn Ser
195 200 205

Leu His Leu Thr Ser Met Cys Leu Gln Tyr Arg Pro Thr Val Val Ala
210 215 220

Cys Phe Cys Ile Tyr Leu Ala Cys Lys Trp Ser Arg Trp Glu Ile Pro
225 230 235 240

Gln Ser Thr Glu Gly Lys His Trp Phe Tyr Tyr Val Asp Lys Thr Val
245 250 255

Ser Leu Asp Leu Leu Lys Gln Leu Thr Asp Glu Phe Ile Ala Ile Tyr
260 265 270

Glu Lys Ser Pro Ala Arg Leu Lys Ser Lys Leu Asn Ser Ile Lys Ala
275 280 285

Ile Ala Gln Gly Ala Ser Asn Arg Thr Ala Asn Ser Lys Asp Lys Pro
290 295 300

Lys Glu Asp Trp Lys Ile Thr Glu Met Met Lys Gly Tyr His Ser Asn
305 310 315 320

Ile Thr Thr Pro Pro Glu Leu Leu Asn Gly Asn Asp Ser Arg Asp Arg
325 330 335

Asp Arg Asp Arg Glu Arg Glu Arg Glu Arg Glu Arg Asp Pro Ser Ser
340 345 350

Leu Leu Pro Pro Pro Ala Met Val Pro Gln Gln Arg Arg Gln Asp Gly
355 360 365

Gly His Gln Arg Ser Ser Ser Val Ser Gly Val Pro Gly Ser Ser Ser
370 375 380

Ser Ser Ser Ser Ser Ser His Lys Met Pro Asn Tyr Pro Gly Gly Met
385 390 395 400

Pro Pro Glu Ala His Pro Asp His Lys Ser Lys Gln Pro Gly Tyr Asn
405 410 415

Asn Arg Met Pro Ser Ser His Gln Arg Ser Ser Ser Ser Gly Leu Gly
420 425 430

Ser Ser Gly Ser Gly Ser Gln His Ser Ser Ser Ser Ser Ser Ser Ser
435 440 445

Ser Gln Gln Pro Gly Arg Pro Ser Met Pro Val Asp Tyr His Lys Ser
450 455 460

Ser Arg Gly Met Pro Pro Val Gly Val Gly Met Pro Pro His Gly Ser
465 470 475 480

His Lys Met Thr Ser Gly Ser Lys Pro Gln Gln Pro Gln Gln Gln Pro
485 490 495

Val Pro His Pro Ser Ala Ser Asn Ser Ser Ala Ser Gly Met Ser Ser
500 505 510

Lys Asp Lys Ser Gln Ser Asn Lys Met Tyr Pro Asn Ala Pro Pro Pro
515 520 525

Tyr Ser Asn Ser Ala Pro Gln Asn Pro Leu Met Ser Arg Gly Gly Tyr
530 535 540

Pro Gly Ala Ser Asn Gly Ser Gln Pro Pro Pro Pro Ala Gly Tyr Gly
545 550 555 560

Gly His Arg Ser Lys Ser Gly Ser Thr Val His Gly Met Pro His Phe
565 570 575

Glu Gln Gln Leu Pro Tyr Ser Gln Ser Gln Ser Tyr Gly His Met Gln
580 585 590

Gln Gln Pro Val Pro Gln Ser Gln Gln Gln Gln Met Pro Pro Glu Ala
595 600 605

Ser Gln His Ser Leu Gln Ser Lys Asn Ser Leu Phe Ser Pro Glu Trp
610 615 620

Pro Asp Ile Lys Lys Glu Pro Met Ser Gln Ser Gln Pro Gln Leu Phe
625 630 635 640

Asn Gly Leu Leu Pro Pro Pro Ala Pro Pro Gly His Asp Tyr Lys Leu
645 650 655

Asn Ser His Pro Arg Asp Lys Glu Ser Pro Lys Lys Glu Arg Leu Thr
660 665 670

Pro Thr Lys Lys Asp Lys His Arg Pro Val Met Pro Pro Met Gly Ser
675 680 685

Gly Asn Ser Ser Ser Gly Ser Gly Ser Ser Lys Pro Met Leu Pro Pro
690 695 700

His Lys Lys Gln Ile Pro His Gly Gly Asp Leu Leu Thr Asn Pro Gly
705 710 715 720

Glu Ser Gly Ser Leu Lys Arg Pro Asn Glu Ile Ser Gly Ser Gln Tyr
725 730 735

Gly Leu Asn Lys Leu Asp Glu Ile Asp Asn Ser Asn Met Pro Arg Glu
740 745 750

Lys Leu Arg Lys Leu Asp Thr Thr Thr Gly Leu Pro Thr Tyr Pro Asn
755 760 765

Tyr Glu Glu Lys His Thr Pro Leu Asn Met Ser Asn Gly Ile Glu Thr
770 775 780

Thr Pro Asp Leu Val Arg Ser Leu Leu Lys Glu Ser Leu Cys Pro Ser
785 790 795 800

Asn Ala Ser Leu Leu Lys Pro Asp Ala Leu Thr Met Pro Gly Leu Lys
805 810 815

Pro Pro Ala Glu Leu Leu Glu Pro Met Pro Ala Pro Ala Thr Ile Lys
820 825 830

Lys Glu Gln Gly Ile Thr Pro Met Thr Ser Leu Ala Ser Gly Pro Ala
835 840 845

Pro Met Asp Leu Glu Val Pro Thr Lys Gln Ala Gly Glu Ile Lys Glu
850 855 860

Glu Ser Ser Ser Lys Ser Glu Lys Lys Lys Lys Lys Asp Lys His Lys
865 870 875 880

His Lys Glu Lys Asp Lys Ser Lys Asp Lys Thr Glu Lys Glu Glu Arg
885 890 895

Lys Lys His Lys Arg Asp Lys Gln Lys Asp Arg Ser Gly Ser Gly Gly
900 905 910

Ser Lys Asp Ser Ser Leu Pro Asn Glu Pro Leu Lys Met Val Ile Lys
915 920 925

Asn Pro Asn Gly Ser Leu Gln Ala Gly Ala Ser Ala Pro Ile Lys Leu
930 935 940

Lys Ile Ser Lys Asn Lys Val Glu Pro Asn Asn Tyr Ser Ala Ala Ala
945 950 955 960

Gly Leu Pro Gly Ala Ile Gly Tyr Gly Leu Pro Pro Thr Thr Ala Thr
965 970 975

Thr Thr Ser Ala Ser Ile Gly Ala Ala Ala Pro Val Leu Pro Pro Tyr
980 985 990

Gly Ala Gly Gly Gly Gly Tyr Ser Ser Ser Gly Gly Ser Ser Ser Gly
995 1000 1005

Gly Ser Ser Lys Lys Lys His Ser Asp Arg Asp Arg Asp Lys Glu Ser
1010 1015 1020

Lys Lys Asn Lys Ser Gln Asp Tyr Ala Lys Tyr Asn Gly Ala Gly Gly
1025 1030 1035 1040

Gly Ile Phe Asn Pro Leu Gly Gly Ala Gly Ala Ala Pro Asn Met Ser
1045 1050 1055

Gly Gly Met Gly Ala Pro Met Ser Thr Ala Val Pro Pro Ser Met Leu
1060 1065 1070

Leu Ala Pro Thr Gly Ala Val Pro Pro Ser Ala Ala Gly Leu Ala Pro
1075 1080 1085

Pro Pro Met Pro Val Tyr Asn Lys Lys
1090 1095



(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1119 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ix) FEATURE: 

(A) NAME/KEY: CDS

(B) LOCATION: 1..1116 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 

ATG GCA AAG CAG TAC GAC TCG GTG GAG TGC CCT TTT TGT GAT GAA GTT 48
Met Ala Lys Gln Tyr Asp Ser Val Glu Cys Pro Phe Cys Asp Glu Val
1 5 10 15

TCC AAA TAC GAG AAG CTC GCC AAG ATC GGC CAA GGC ACC TTC GGG GAG 96
Ser Lys Tyr Glu Lys Leu Ala Lys Ile Gly Gln Gly Thr Phe Gly Glu
20 25 30

GTG TTC AAG GCC AGG CAC CGC AAG ACC GGC CAG AAG GTG GCT CTG AAG 144
Val Phe Lys Ala Arg His Arg Lys Thr Gly Gln Lys Val Ala Leu Lys
35 40 45

AAG GTG CTG ATG GAA AAC GAG AAG GAG GGG TTC CCC ATT ACA GCC TTG 192
Lys Val Leu Met Glu Asn Glu Lys Glu Gly Phe Pro Ile Thr Ala Leu
50 55 60

CGG GAG ATC AAG ATC CTT CAG CTT CTA AAA CAC GAG AAT GTG GTC AAC 240
Arg Glu Ile Lys Ile Leu Gln Leu Leu Lys His Glu Asn Val Val Asn
65 70 75 80

TTG ATT GAG ATT TGT CGA ACC AAA GCT TCC CCC TAT AAC CGC TGC AAG 288
Leu Ile Glu Ile Cys Arg Thr Lys Ala Ser Pro Tyr Asn Arg Cys Lys
85 90 95

GGT AGT ATA TAC CTG GTG TTC GAC TTC TGC GAG CAT GAC CTT GCT GGG 336
Gly Ser Ile Tyr Leu Val Phe Asp Phe Cys Glu His Asp Leu Ala Gly
100 105 110

CTG TTG AGC AAT GTT TTG GTC AAG TTC ACG CTG TCT GAG ATC AAG AGG 384
Leu Leu Ser Asn Val Leu Val Lys Phe Thr Leu Ser Glu Ile Lys Arg
115 120 125

GTG ATG CAG ATG CTG CTT AAC GGC CTC TAC TAC ATC CAC AGA AAC AAG 432
Val Met Gln Met Leu Leu Asn Gly Leu Tyr Tyr Ile His Arg Asn Lys
130 135 140

ATC CTG CAT AGG GAC ATG AAG GCT GCT AAT GTG CTT ATC ACT CGT GAT 480
Ile Leu His Arg Asp Met Lys Ala Ala Asn Val Leu Ile Thr Arg Asp
145 150 155 160

GGG GTC CTG AAG CTG GCA GAC TTT GGG CTG GCC CGG GCC TTC AGC CTG 528
Gly Val Leu Lys Leu Ala Asp Phe Gly Leu Ala Arg Ala Phe Ser Leu
165 170 175

GCC AAG AAC AGC CAG CCC AAC CGC TAC ACC AAC CGT GTG GTG ACA CTC 576
Ala Lys Asn Ser Gln Pro Asn Arg Tyr Thr Asn Arg Val Val Thr Leu
180 185 190

TGG TAC CGG CCC CCG GAG CTG TTG CTC GGG GAG CGG GAC TAC GGC CCC 624
Trp Tyr Arg Pro Pro Glu Leu Leu Leu Gly Glu Arg Asp Tyr Gly Pro
195 200 205

CCC ATT GAC CTG TGG GGT GCT GGG TGC ATC ATG GCA GAG ATG TGG ACC 672
Pro Ile Asp Leu Trp Gly Ala Gly Cys Ile Met Ala Glu Met Trp Thr
210 215 220

CGC AGC CCC ATC ATG CAG GGC AAC ACG GAG CAG CAC CAA CTC GCC CTC 720
Arg Ser Pro Ile Met Gln Gly Asn Thr Glu Gln His Gln Leu Ala Leu
225 230 235 240

ATC AGT CAG CTC TGC GGC TCC ATC ACC CCT GAG GTG TGG CCA AAC GTG 768
Ile Ser Gln Leu Cys Gly Ser Ile Thr Pro Glu Val Trp Pro Asn Val
245 250 255

GAC AAC TAT GAG CTG TAC GAA AAG CTG GAG CTG GTC AAG GGC CAG AAG 816
Asp Asn Tyr Glu Leu Tyr Glu Lys Leu Glu Leu Val Lys Gly Gln Lys
260 265 270

CGG AAG GTG AAG GAC AGG CTG AAG GCC TAT GTG CGT GAC CCA TAC GCA 864
Arg Lys Val Lys Asp Arg Leu Lys Ala Tyr Val Arg Asp Pro Tyr Ala
275 280 285

CTG GAC CTC ATC GAC AAG CTG CTG GTG CTG GAC CCT GCC CAG CGC ATC 912
Leu Asp Leu Ile Asp Lys Leu Leu Val Leu Asp Pro Ala Gln Arg Ile
290 295 300

GAC AGC GAT GAC GCC CTC AAC CAC GAC TTC TTC TGG TCC GAC CCC ATG 960
Asp Ser Asp Asp Ala Leu Asn His Asp Phe Phe Trp Ser Asp Pro Met
305 310 315 320

CCC TCC GAC CTC AAG GGC ATG CTC TCC ACC CAC CTG ACG TCC ATG TTC 1008
Pro Ser Asp Leu Lys Gly Met Leu Ser Thr His Leu Thr Ser Met Phe
325 330 335

GAG TAC TTG GCA CCA CCG CGC CGG AAG GGC AGC CAG ATC ACC CAG CAG 1056
Glu Tyr Leu Ala Pro Pro Arg Arg Lys Gly Ser Gln Ile Thr Gln Gln
340 345 350

TCC ACC AAC CAG AGT CGC AAT CCC GCC ACC ACC AAC CAG ACG GAG TTT 1104
Ser Thr Asn Gln Ser Arg Asn Pro Ala Thr Thr Asn Gln Thr Glu Phe
355 360 365

GAG CGC GTC TTC TGA 1119
Glu Arg Val Phe
370



(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 372 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 

Met Ala Lys Gln Tyr Asp Ser Val Glu Cys Pro Phe Cys Asp Glu Val
1 5 10 15

Ser Lys Tyr Glu Lys Leu Ala Lys Ile Gly Gln Gly Thr Phe Gly Glu
20 25 30

Val Phe Lys Ala Arg His Arg Lys Thr Gly Gln Lys Val Ala Leu Lys
35 40 45

Lys Val Leu Met Glu Asn Glu Lys Glu Gly Phe Pro Ile Thr Ala Leu
50 55 60

Arg Glu Ile Lys Ile Leu Gln Leu Leu Lys His Glu Asn Val Val Asn
65 70 75 80

Leu Ile Glu Ile Cys Arg Thr Lys Ala Ser Pro Tyr Asn Arg Cys Lys
85 90 95

Gly Ser Ile Tyr Leu Val Phe Asp Phe Cys Glu His Asp Leu Ala Gly
100 105 110

Leu Leu Ser Asn Val Leu Val Lys Phe Thr Leu Ser Glu Ile Lys Arg
115 120 125

Val Met Gln Met Leu Leu Asn Gly Leu Tyr Tyr Ile His Arg Asn Lys
130 135 140

Ile Leu His Arg Asp Met Lys Ala Ala Asn Val Leu Ile Thr Arg Asp
145 150 155 160

Gly Val Leu Lys Leu Ala Asp Phe Gly Leu Ala Arg Ala Phe Ser Leu
165 170 175

Ala Lys Asn Ser Gln Pro Asn Arg Tyr Thr Asn Arg Val Val Thr Leu
180 185 190

Trp Tyr Arg Pro Pro Glu Leu Leu Leu Gly Glu Arg Asp Tyr Gly Pro
195 200 205

Pro Ile Asp Leu Trp Gly Ala Gly Cys Ile Met Ala Glu Met Trp Thr
210 215 220

Arg Ser Pro Ile Met Gln Gly Asn Thr Glu Gln His Gln Leu Ala Leu
225 230 235 240

Ile Ser Gln Leu Cys Gly Ser Ile Thr Pro Glu Val Trp Pro Asn Val
245 250 255

Asp Asn Tyr Glu Leu Tyr Glu Lys Leu Glu Leu Val Lys Gly Gln Lys
260 265 270

Arg Lys Val Lys Asp Arg Leu Lys Ala Tyr Val Arg Asp Pro Tyr Ala
275 280 285

Leu Asp Leu Ile Asp Lys Leu Leu Val Leu Asp Pro Ala Gln Arg Ile
290 295 300

Asp Ser Asp Asp Ala Leu Asn His Asp Phe Phe Trp Ser Asp Pro Met
305 310 315 320

Pro Ser Asp Leu Lys Gly Met Leu Ser Thr His Leu Thr Ser Met Phe
325 330 335

Glu Tyr Leu Ala Pro Pro Arg Arg Lys Gly Ser Gln Ile Thr Gln Gln
340 345 350

Ser Thr Asn Gln Ser Arg Asn Pro Ala Thr Thr Asn Gln Thr Glu Phe
355 360 365

Glu Arg Val Phe
370



(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 
ACGAATTCCA CACAATCCAA AGATC 25 



(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 26 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 



CAGAATTCCT ATTGCCGATC CCCAGA 26



(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ix) FEATURE: 

(A) NAME/KEY: modified_base

(B) LOCATION: one-of(8, 14)

(D) OTHER INFORMATION: /mod_base= OTHER
/note= "N = A or C or G or T"

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: 12

(D) OTHER INFORMATION: /mod_base= OTHER
/note= "Y = C or T"

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: one-of(17, 20)

(D) OTHER INFORMATION: /mod_base= OTHER
/note= "R = A or G"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 

GGAATTCNAT GYTNCARCAR CC 
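
The FEATURE entries for SEQ ID NO: 9 define degenerate positions (N, Y, R) using the standard IUPAC nucleotide codes. As a minimal sketch of what that degeneracy means in practice, the Python below counts and enumerates the concrete oligonucleotides represented by the degenerate sequence. The names `IUPAC`, `degeneracy` and `expand` are hypothetical helpers introduced for this illustration; they are not part of the listing.

```python
from itertools import product

# IUPAC nucleotide codes, restricted to the symbols used in this listing.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG",    # purine
    "Y": "CT",    # pyrimidine
    "W": "AT",
    "S": "CG",
    "N": "ACGT",
}


def degeneracy(primer: str) -> int:
    """Number of distinct sequences a degenerate primer represents."""
    total = 1
    for base in primer:
        total *= len(IUPAC[base])
    return total


def expand(primer: str) -> list[str]:
    """Enumerate every concrete sequence encoded by a degenerate primer."""
    return ["".join(bases) for bases in product(*(IUPAC[b] for b in primer))]


# SEQ ID NO: 9 with the spacing removed.
seq9 = "GGAATTCNATGYTNCARCARCC"
variants = expand(seq9)
print(degeneracy(seq9), len(variants))
```

With two N positions, one Y and two R positions, the primer stands for 4 x 2 x 4 x 2 x 2 = 128 distinct 22-mers.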



(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: one-of(13, 16, 19, 22, 25)

(D) OTHER INFORMATION: /mod_base= OTHER
/note= "R = A or G"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 
AACTGCAGTC CARAARAART CRTGRTT 
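
A degenerate primer that anneals to the antisense direction is easier to read once it is reverse-complemented. The sketch below does this for SEQ ID NO: 10 while keeping the IUPAC ambiguity codes (R and Y swap; W, S and N are their own complements). The helper name `reverse_complement` is hypothetical, and the reading of the result in codons is an interpretation offered for illustration, not a statement from the listing.

```python
# Complement map including the IUPAC ambiguity codes used in this listing.
COMPLEMENT = {
    "A": "T", "T": "A", "C": "G", "G": "C",
    "R": "Y", "Y": "R",   # purine <-> pyrimidine
    "W": "W", "S": "S", "N": "N",
}


def reverse_complement(seq: str) -> str:
    """Reverse-complement a (possibly degenerate) DNA sequence."""
    seq = seq.replace(" ", "").upper()
    return "".join(COMPLEMENT[base] for base in reversed(seq))


# SEQ ID NO: 10 as written in the listing (spaces are ignored).
seq10_rc = reverse_complement("AACTGCAGTC CARAARAART CRTGRTT")
print(seq10_rc)
```

Read in triplets, the first 18 bases of the result (AAY CAY GAY TTY TTY TGG) are degenerate codons for Asn His Asp Phe Phe Trp, which appears to correspond to the first six residues of the peptide in SEQ ID NO: 34.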



(2) INFORMATION FOR SEQ ID NO: 11: 



(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 25 base pairs 



(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:
TGTCAAGGAT CAAACCGGCT GTGAT 



(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:
CGAATTCCAA GAAACGCATC GATGC 



(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:
AGACCTGCCA AATCGTGT 



(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:
AGAAGGTGGA TCTGTAACCA TTCGT 



(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:



GGAATTCAGA TCTCGATCAG ATTCA 



(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 
TTACTACTCG AGCTACCAAA CCCGGTC 



(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 28 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 
TAAGCAAGCT TCTATGGCGC ACATGTCC 



(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 
TTACTACTCG AGCTACCAAA CCCGGTC 



(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ix) FEATURE: 

(A) NAME/KEY: modified_base

(B) LOCATION: one-of(13, 16, 22)

(D) OTHER INFORMATION: /mod_base= OTHER
/note= "Y = C or T"

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: 17

(D) OTHER INFORMATION: /mod_base= OTHER
/note= "W = A or T"

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: 18

(D) OTHER INFORMATION: /mod_base= OTHER
/note= "S = C or G"

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: 19 

(D) OTHER INFORMATION: /mod_base= OTHER 
/note= "N = A or C or G or T" 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: 

GGAATTCTGG TAYTTYWSNA AYGA 



(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: 11

(D) OTHER INFORMATION: /mod_base= OTHER
/note= "Y = C or T"

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: 14

(D) OTHER INFORMATION: /mod_base= OTHER
/note= "R = A or G"

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: one-of(17, 20)

(D) OTHER INFORMATION: /mod_base= OTHER
/note= "N = A or C or G or T"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:

CGGGATCCTG YTCRAANGGN GGCAT 



(2) INFORMATION FOR SEQ ID NO: 21: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ix) FEATURE: 

(A) NAME/KEY: modified_base

(B) LOCATION: one-of(11, 14, 20)

(D) OTHER INFORMATION: /mod_base= OTHER
/note= "N = A or C or G or T"

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: 23

(D) OTHER INFORMATION: /mod_base= OTHER
/note= "R = A or G"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21: 

CGGGATCCAA NGGNGGCATN CCRT 



(2) INFORMATION FOR SEQ ID NO: 22: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22: 
ATCACGACAC CACCAGAGCT GTTA



(2) INFORMATION FOR SEQ ID NO: 23: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23: 
CGAATTCAGA TCGTGAACGG GA 



(2) INFORMATION FOR SEQ ID NO: 24: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24:
CGAATTCAGG CGCTAGCAAT G 



(2) INFORMATION FOR SEQ ID NO: 25: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25:
GAAAGGCGTA GAACCGA 



(2) INFORMATION FOR SEQ ID NO: 26: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 26 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26:
GCTGACCCAT TTCCTGTATC AGATAG 



(2) INFORMATION FOR SEQ ID NO: 27: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27:
GGAATTCTTC TGCTTGGCGA AT 



(2) INFORMATION FOR SEQ ID NO: 28: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28: 
GGGAATTCGA GGTTCTATAC ATAT 



(2) INFORMATION FOR SEQ ID NO:29: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29: 
CTGTGTGAAT GGAATCTGTG ATGTG 25 



(2) INFORMATION FOR SEQ ID NO: 30: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 28 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30: 
TATCCCGGGT CATATGAGTC TCCTAGCC 28 



(2) INFORMATION FOR SEQ ID NO: 31: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31: 

Met Leu Gln Gln Pro Ser Gly Ser Thr Pro Ser Asn Val
1 5 10



(2) INFORMATION FOR SEQ ID NO: 32: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32: 

Ala Asp Thr Ala Leu Asn His Asp Phe Phe Trp Thr Asp Pro Met Pro 
1 5 10 15

Ser 



(2) INFORMATION FOR SEQ ID NO: 33: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33: 

Met Leu Gln Gln Pro
1 5 



(2) INFORMATION FOR SEQ ID NO: 34: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34: 

Asn His Asp Phe Phe Trp Thr 
1 5 



(2) INFORMATION FOR SEQ ID NO: 35: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35: 

Ser Pro Glu Trp Pro Asp Ile
1 5 



(2) INFORMATION FOR SEQ ID NO: 36: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 14 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36: 

Trp Tyr Phe Ser Asn Asp Gln Leu Ala Asn Ser Pro Ser Arg
1 5 10



(2) INFORMATION FOR SEQ ID NO: 37: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 14 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37: 

Thr Val His Gly Met Pro Pro Phe Glu Gln Gln Leu Pro Tyr
1 5 10



(2) INFORMATION FOR SEQ ID NO: 38: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38: 

Trp Tyr Phe Ser Asn Asp 
1 5 



(2) INFORMATION FOR SEQ ID NO: 39: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39: 

Met Pro Pro Phe Glu Gln
1 5 



(2) INFORMATION FOR SEQ ID NO: 40: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40: 

His Gly Met Pro Pro Phe 
1 5 



(2) INFORMATION FOR SEQ ID NO: 41: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 41 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41: 
GCAGGATCCA GAATTCCATA TGGCAAAGCA GTACGACTCG G 41 



(2) INFORMATION FOR SEQ ID NO: 42: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42: 
CAGTACTCGA GTTATCAGAA GACGCGCTCA AAC 33 



(2) INFORMATION FOR SEQ ID NO: 43: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4528 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43: 

GGGGGGGGGG GGGTGAATGA AGGAGCGGGC GGAGGAGGAA TTGTCATGGC GTCGGGCCGT 60 

GGAGCTTCTT CTCGCTGGTT CTTTACTCGG GAACAGCTGG AGAACACGCC GAGCCGCCGC 120 

TGCGGAGTGG AGGCGGATAA AGAGCTCTCG TGCCGCCAGC AGGCGGCCAA CCTCATCCAG 180 

GAGATGGGAC AGCGTCTCAA TGTCTCTCAG CTTACAATAA ACACTGCGAT TGTTTATATG 240

CACAGGTTTT ATATGCACCA TTCTTTCACC AAATTCAACA AAAATATAAT ATCGTCTACT 300 

GCATTATTTT TGGCTGCAAA AGTGGAAGAA CAGGCTCGAA AACTTGAACA TGTTATCAAA 360 

GTAGCACATG CTTGTCTTCA TCCTCTAGAG CCACTGCTGG ATACTAAATG TGATGCTTAC 420 

CTTCAACAGA CTCAAGAACT GGTTATACTT GAAACCATAA TGCTACAAAC TCTAGGTTTT 480

GAGATCACCA TTGAACACCC ACACACAGAT GTGGTGAAAT GTACCCAGTT AGTAAGAGCA 540

AGCAAGGATT TGGCACAGAC ATCCTATTTC ATGGCTACCA ACAGTCTGCA TCTTACAACC 600 

TTCTGTCTTC AGTACAAACC AACAGTGATA GCATGTGTAT GCATTCATTT GGCTTGCAAA 660 



TGGTCCAATT GGGAGATCCC TGTATCAACT GATGGAAAGC ATTGGTGGGA ATATGTGGAT 720 



CCTACAGTTA CTCTAGAATT ATTAGATGAG CTAACACATG AGTTTCTACA AATATTGGAG 780 

AAAACGCCTA ATAGGTTGAA GAAGATTCGA AACTGGAGGG CTAATCAGGC AGCTAGGAAA 840

CCAAAAGTAG ATGGACAGGT ATCAGAGACA CCACTTCTTG GTTCATCTTT GGTCCAGAAT 900 

TCCATTTTAG TAGATAGTGT CACTGGTGTG CCTACAAACC CAAGTTTTCA GAAACCATCT 960 

ACATCAGCAT TCCCTGCGCC AGTACCTCTA AATTCAGGAA ATATTTCTGT TCAAGACAGC 1020 

CATACATCTG ATAATTTGTC AATGCTAGCA ACAGGAATGC CAAGTACTTC ATACGGTTTA 1080 

TCATCACACC AGGAATGGCC TCAACATCAA GACTCAGCAA GGACAGAACA GCTATATTCA 1140

CAGAAACAGG AGACATCTTT GTCTGGTAGC CAGTACAACA TCAACTTCCA GCAGGGACCT 1200 

TCTATATCAC TGCATTCAGG ATTACATCAC AGACCTGACA AAATTTCAGA TCATTCTTCT 1260 

GTTAAGCAAG AATATACTCA TAAAGCAGGG AGCAGTAAAC ACCATGGGCC AATTTCCACT 1320 

ACTCCAGGAA TAATTCCTCA GAAAATGTCT TTAGATAAAT ATAGAGAAAA GCGTAAACTA 1380 

GAAACTCTTG ATCTCGATGT AAGGGATCAT TATATAGCTG CCCAGGTAGA ACAGCAGCAC 1440

AAACAAGGGC AGTCACAGGC AGCCAGCAGC AGTTCTGTTA CTTCTCCCAT TAAAATGAAA 1500 

ATACCTATCG CAAATACTGA AAAATACATG GCAGATAAAA AGGAAAAGAG TGGGTCACTG 1560 

AAATTACGGA TTCCAATACC ACCCACTGAT AAAAGCGCCA GTAAAGAAGA ACTGAAAATG 1620 

AAAATAAAAG TTTCTTCTTC AGAAAGACAC AGCTCTTCTG ATGAAGGCAG TGGGAAAAGC 1680 

AAACATTCAA GCCCACATAT TAGCAGAGAC CATAAGGAGA AGCACAAGGA GCATCCTTCA 1740

AGCCGCCACC ACACCAGCAG CCACAAGCAT TCCCACTCGC ATAGTGGCAG CAGCAGCGGT 1800 

GGCAGTAAAC ACAGTGCCGA CGGAATACCA CCCACTGTTC TGAGGAGTCC TGTTGGCCTG 1860

AGCAGTGATG GCATTTCCTC TAGCTCCAGC TCTTCAAGGA AGAGGCTGCA TGTCAATGAT 1920 

GCATCTCACA ACCACCACTC CAAAATGAGC AAAAGTTCCA AAAGTTCAGG TGGGCTACGG 1980 

ACATCTCAGC ACCTCGTGAA ACTGGACAAG AAGCCAGTGG AGACCAACGG TCCTGATGCC 2040 

AATCACGAGT ACAGTACAAG CAGCCAGCAT ATGGACTACA AAGACACATT CGACATGCTG 2100

GACTCACTGT TAAGTGCCCA AGGAATGAAC ATGTAATAAT TTGTTTAGGT CAATTTTTCC 2160 

TTTACTTTTT TAATTTAAAA ATTGTTAGAA TGGAAAAATT CCTTCTGATC TAGCAGTGGT 2220 

AACCCCTGCT GTTGCTGCCA CTGCTTCAAT ATTTGTAAGT GCTACTTTAT TCTTCATTCT 2280 

GAAAAGAAGA GATTATAGTA AACAAGTCTT TATCTCCACA TATGATAGTG TTATAAATAC 2340

TGTAAAGGCA TGGAAGGTGC AAAACTCAGT ATTTCTACAA TTGCAGCTAA GAACATTAGG 2400 



ATGAATGGCT GGCTGCTTCT AGGAATATAA GATGCCTCAA GCATTCATTA TTTATGATTT 2460

GAATACTGTA GCTATTTTTT GTTGCTTGGC TTTTGAATGA GTGTAAATTG TTTTCTTTTG 2520 

TGTATTTATA CTTGTATGTA TGATTTGCAT GTTTCAATGA TAAAGGGATA AAACAGTATA 2580 

CTGACAACTG TTTACAAGAA AGTGGAGAAA ATGTACTACA TTTTGTATGT TTAGATATTA 2640

CCGTAAATAC TCAGGATTGG AGCTGCTTGT AAGTATAACA ATATACAGAA TACTTTATTT 2700

TATCTTGTCA GAGTTCCATC ACTATCTAAA ACAAAGGTGC AATTTTTTAT GTTAACCTTA 2760

AATCTAGCCC TTACTGGAAG CCACTGATAG GGACATTCAC TACCAGATGT GTGCAGTGCA 2820 

GCAGATGGTC ATATAACACT GTGAGGCACT GAATTTTGCC TTCAGAGGTT CTGACCAGAT 2880 

TGGCTGCTGA AATAGCCCCT AACTTTCTGA AGGCTTGAAG AGGAAAAAAT AAAGTTTACA 2940

TACTCTTGAT GGAAGTGCAT TTAAATGTTT GTTGGCTTGT TGCAGTTCTA TGAAACAGAG 3000 

CTGTTAATAA TGGTTATGTG GATTACTGTG ATTTGAAAAC TAAATTCACA ATAACTTACC 3060 

TAGTAGAGAT TTAGTGAGTT GTTTCCTTTA AAGAATTTTA CACTACATAT TTTAATAGTA 3120 

AACAGGGTCA CTTTCCTTTA GCATTCAGAA TGACACCATA TTCTTAAATA TACTCCTTCC 3180

CTGAAGCGTG TTTGTGTGTG ATGCCATATT TCTTTTTCAG GTAAATGTAG TCTTCCTTAT 3240 

AAAAATGAAA TTAAACCTAT GCTCTCAATT CTTTTATATT CTAACAATAA ATAAAAAAGA 3300 

AAAGATTACT GACTGTGCAT TGTACCTGTA TTTATAGTTT ATGGTTATCA GAAGCTCTGT 3360 

AAGAAAGAAA AGGTCAGCTC CCAGGCAAAC CAGTAGTGGA GGTTTTACAT TTGTTTGCAC 3420

ATCTCAGTAT ATTTCTGTTG AGGTAAAGTT TGCACAGTCA TCTGACTTCT GATCAAGCAT 3480

TAGATTTTAA CTTGTTTAGA TTTTGTCTTA AACACCAGTA ATATGGCTCT TGTTTATCAG 3540 

CTAATCTTGA ATTTATTCTG TGGTAAATCT TTTGAGTTGC TGAGTATATT TGAGATTGAT 3600 

TGGATTCAAC CTCTTGTTGA ACTGAAAACT TAATTTTTTC TCTGTATTTT TGTTACAAAG 3660 

CCACTGATAC GTGCACAATT GTAATTAAGT ATGTTGCAGT TGTAAATATT AGAGTTTAAT 3720 

CTCATGCTCT ACCTTTATTT AGCAATTACC TAATTTGCCA GTAGCTTTAT AATTTTTAAA 3780 

GATAATTGTT CATTATTTTG TCAATGTTAT TTGAACTTGG GGTACTTAGG AGCCTCTTTG 3840

TAGGGACTGT GCCTAGGTAG CATGTCCTAA CATTTGTTCT GGTCTTGCAT AACTTCAGTA 3900 

TCTTTGTCAT TATATGTAAC TTTGTTGCTC TGTATGGCAT AATATTGTAT CCATAAACAT 3960 

GGTAATTTTG ATACAGTTAT ACTTTTACAG TGGTACATAA TCCAAGGACT AGTATAGAAT 4020

TAAGCTGAGT GCAAGATGAG GGAGGGAAGG GCTTTCTTGG TAATTTAGAT GTGAAACCTC 4080

TACAGAGCTA TCATGTAAAA ACTACATGAG GTGGTTGTGC TACTGTATAA TTGGGGGTGA 4140



TAATACCAGG AATTTTAATA AGATTTTGTA AAGAATATCC AGAAAAGTAG TGAACTTATT 4200 

TTCAGTAGGC ATAGAAAACA ATGTGAATAT TTAAGGTCTG TGACTATAGT TAAACTTCAC 4260

TAAGAATTTG CAGAATTGTT TTGAGATGTG TGAATAAAGG TAATTTTATT GAATCTTCAT 4320 

TGGTGCTAAT GTTGGACAGT TAAAAAGATA GCTAGTGTAT ATTGTTATGG GTCAGTACTT 4380 

ATTAGTACTT CCAAAATTGA ATTTGAAATG CTATGTATTC ACTTTTCACT CTGTAAATGT 4440

AATTCTTTAC AATGACTTTA TTTATTAAAG GGCAGCCAGT TGTCATTTGT AAAAAAAAAA 4500 

AAAAAAAAAA AAAGCGGCCG CTGAATTC 4528

(2) INFORMATION FOR SEQ ID NO: 44: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2091 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44: 

ATGGCGTCGG GCCGTGGAGC TTCTTCTCGC TGGTTCTTTA CTCGGGAACA GCTGGAGAAC 60 

ACGCCGAGCC GCCGCTGCGG AGTGGAGGCG GATAAAGAGC TCTCGTGCCG CCAGCAGGCG 120 

GCCAACCTCA TCCAGGAGAT GGGACAGCGT CTCAATGTCT CTCAGCTTAC AATAAACACT 180 

GCGATTGTTT ATATGCACAG GTTTTATATG CACCATTCTT TCACCAAATT CAACAAAAAT 240

ATAATATCGT CTACTGCATT ATTTTTGGCT GCAAAAGTGG AAGAACAGGC TCGAAAACTT 300 

GAACATGTTA TCAAAGTAGC ACATGCTTGT CTTCATCCTC TAGAGCCACT GCTGGATACT 360 

AAATGTGATG CTTACCTTCA ACAGACTCAA GAACTGGTTA TACTTGAAAC CATAATGCTA 420 

CAAACTCTAG GTTTTGAGAT CACCATTGAA CACCCACACA CAGATGTGGT GAAATGTACC 480 

CAGTTAGTAA GAGCAAGCAA GGATTTGGCA CAGACATCCT ATTTCATGGC TACCAACAGT 540

CTGCATCTTA CAACCTTCTG TCTTCAGTAC AAACCAACAG TGATAGCATG TGTATGCATT 600 

CATTTGGCTT GCAAATGGTC CAATTGGGAG ATCCCTGTAT CAACTGATGG AAAGCATTGG 660 

TGGGAATATG TGGATCCTAC AGTTACTCTA GAATTATTAG ATGAGCTAAC ACATGAGTTT 720 

CTACAAATAT TGGAGAAAAC GCCTAATAGG TTGAAGAAGA TTCGAAACTG GAGGGCTAAT 780 

CAGGCAGCTA GGAAACCAAA AGTAGATGGA CAGGTATCAG AGACACCACT TCTTGGTTCA 840

TCTTTGGTCC AGAATTCCAT TTTAGTAGAT AGTGTCACTG GTGTGCCTAC AAACCCAAGT 900 

TTTCAGAAAC CATCTACATC AGCATTCCCT GCGCCAGTAC CTCTAAATTC AGGAAATATT 960 



TCTGTTCAAG ACAGCCATAC ATCTGATAAT TTGTCAATGC TAGCAACAGG AATGCCAAGT 1020 

ACTTCATACG GTTTATCATC ACACCAGGAA TGGCCTCAAC ATCAAGACTC AGCAAGGACA 1080 

GAACAGCTAT ATTCACAGAA ACAGGAGACA TCTTTGTCTG GTAGCCAGTA CAACATCAAC 1140

TTCCAGCAGG GACCTTCTAT ATCACTGCAT TCAGGATTAC ATCACAGACC TGACAAAATT 1200

TCAGATCATT CTTCTGTTAA GCAAGAATAT ACTCATAAAG CAGGGAGCAG TAAACACCAT 1260 

GGGCCAATTT CCACTACTCC AGGAATAATT CCTCAGAAAA TGTCTTTAGA TAAATATAGA 1320 

GAAAAGCGTA AACTAGAAAC TCTTGATCTC GATGTAAGGG ATCATTATAT AGCTGCCCAG 1380

GTAGAACAGC AGCACAAACA AGGGCAGTCA CAGGCAGCCA GCAGCAGTTC TGTTACTTCT 1440

CCCATTAAAA TGAAAATACC TATCGCAAAT ACTGAAAAAT ACATGGCAGA TAAAAAGGAA 1500 

AAGAGTGGGT CACTGAAATT ACGGATTCCA ATACCACCCA CTGATAAAAG CGCCAGTAAA 1560 

GAAGAACTGA AAATGAAAAT AAAAGTTTCT TCTTCAGAAA GACACAGCTC TTCTGATGAA 1620 

GGCAGTGGGA AAAGCAAACA TTCAAGCCCA CATATTAGCA GAGACCATAA GGAGAAGCAC 1680 

AAGGAGCATC CTTCAAGCCG CCACCACACC AGCAGCCACA AGCATTCCCA CTCGCATAGT 1740 

GGCAGCAGCA GCGGTGGCAG TAAACACAGT GCCGACGGAA TACCACCCAC TGTTCTGAGG 1800 

AGTCCTGTTG GCCTGAGCAG TGATGGCATT TCCTCTAGCT CCAGCTCTTC AAGGAAGAGG 1860 

CTGCATGTCA ATGATGCATC TCACAACCAC CACTCCAAAA TGAGCAAAAG TTCCAAAAGT 1920 

TCAGGTGGGC TACGGACATC TCAGCACCTC GTGAAACTGG ACAAGAAGCC AGTGGAGACC 1980 

AACGGTCCTG ATGCCAATCA CGAGTACAGT ACAAGCAGCC AGCATATGGA CTACAAAGAC 2040 

ACATTCGACA TGCTGGACTC ACTGTTAAGT GCCCAAGGAA TGAACATGTA A 2091 

(2) INFORMATION FOR SEQ ID NO: 45: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 696 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45: 

Met Ala Ser Gly Arg Gly Ala Ser Ser Arg Trp Phe Phe Thr Arg Glu
1 5 10 15

Gln Leu Glu Asn Thr Pro Ser Arg Arg Cys Gly Val Glu Ala Asp Lys
20 25 30

Glu Leu Ser Cys Arg Gln Gln Ala Ala Asn Leu Ile Gln Glu Met Gly
35 40 45

Gln Arg Leu Asn Val Ser Gln Leu Thr Ile Asn Thr Ala Ile Val Tyr
50 55 60

Met His Arg Phe Tyr Met His His Ser Phe Thr Lys Phe Asn Lys Asn
65 70 75 80

Ile Ile Ser Ser Thr Ala Leu Phe Leu Ala Ala Lys Val Glu Glu Gln
85 90 95

Ala Arg Lys Leu Glu His Val Ile Lys Val Ala His Ala Cys Leu His
100 105 110

Pro Leu Glu Pro Leu Leu Asp Thr Lys Cys Asp Ala Tyr Leu Gln Gln
115 120 125

Thr Gln Glu Leu Val Ile Leu Glu Thr Ile Met Leu Gln Thr Leu Gly
130 135 140

Phe Glu Ile Thr Ile Glu His Pro His Thr Asp Val Val Lys Cys Thr
145 150 155 160

Gln Leu Val Arg Ala Ser Lys Asp Leu Ala Gln Thr Ser Tyr Phe Met
165 170 175

Ala Thr Asn Ser Leu His Leu Thr Thr Phe Cys Leu Gln Tyr Lys Pro
180 185 190

Thr Val Ile Ala Cys Val Cys Ile His Leu Ala Cys Lys Trp Ser Asn
195 200 205

Trp Glu Ile Pro Val Ser Thr Asp Gly Lys His Trp Trp Glu Tyr Val
210 215 220

Asp Pro Thr Val Thr Leu Glu Leu Leu Asp Glu Leu Thr His Glu Phe
225 230 235 240

Leu Gln Ile Leu Glu Lys Thr Pro Asn Arg Leu Lys Lys Ile Arg Asn
245 250 255

Trp Arg Ala Asn Gln Ala Ala Arg Lys Pro Lys Val Asp Gly Gln Val
260 265 270

Ser Glu Thr Pro Leu Leu Gly Ser Ser Leu Val Gln Asn Ser Ile Leu
275 280 285

Val Asp Ser Val Thr Gly Val Pro Thr Asn Pro Ser Phe Gln Lys Pro
290 295 300

Ser Thr Ser Ala Phe Pro Ala Pro Val Pro Leu Asn Ser Gly Asn Ile
305 310 315 320



Ser Val Gln Asp Ser His Thr Ser Asp Asn Leu Ser Met Leu Ala Thr
325 330 335

Gly Met Pro Ser Thr Ser Tyr Gly Leu Ser Ser His Gln Glu Trp Pro
340 345 350

Gln His Gln Asp Ser Ala Arg Thr Glu Gln Leu Tyr Ser Gln Lys Gln
355 360 365

Glu Thr Ser Leu Ser Gly Ser Gln Tyr Asn Ile Asn Phe Gln Gln Gly
370 375 380

Pro Ser Ile Ser Leu His Ser Gly Leu His His Arg Pro Asp Lys Ile
385 390 395 400

Ser Asp His Ser Ser Val Lys Gln Glu Tyr Thr His Lys Ala Gly Ser
405 410 415

Ser Lys His His Gly Pro Ile Ser Thr Thr Pro Gly Ile Ile Pro Gln
420 425 430

Lys Met Ser Leu Asp Lys Tyr Arg Glu Lys Arg Lys Leu Glu Thr Leu
435 440 445

Asp Leu Asp Val Arg Asp His Tyr Ile Ala Ala Gln Val Glu Gln Gln
450 455 460

His Lys Gln Gly Gln Ser Gln Ala Ala Ser Ser Ser Ser Val Thr Ser
465 470 475 480

Pro Ile Lys Met Lys Ile Pro Ile Ala Asn Thr Glu Lys Tyr Met Ala
485 490 495

Asp Lys Lys Glu Lys Ser Gly Ser Leu Lys Leu Arg Ile Pro Ile Pro
500 505 510

Pro Thr Asp Lys Ser Ala Ser Lys Glu Glu Leu Lys Met Lys Ile Lys
515 520 525

Val Ser Ser Ser Glu Arg His Ser Ser Ser Asp Glu Gly Ser Gly Lys
530 535 540

Ser Lys His Ser Ser Pro His Ile Ser Arg Asp His Lys Glu Lys His
545 550 555 560

Lys Glu His Pro Ser Ser Arg His His Thr Ser Ser His Lys His Ser
565 570 575

His Ser His Ser Gly Ser Ser Ser Gly Gly Ser Lys His Ser Ala Asp
580 585 590

Gly Ile Pro Pro Thr Val Leu Arg Ser Pro Val Gly Leu Ser Ser Asp
595 600 605

Gly Ile Ser Ser Ser Ser Ser Ser Ser Arg Lys Arg Leu His Val Asn
610 615 620

Asp Ala Ser His Asn His His Ser Lys Met Ser Lys Ser Ser Lys Ser
625 630 635 640

Ser Gly Gly Leu Arg Thr Ser Gln His Leu Val Lys Leu Asp Lys Lys
645 650 655

Pro Val Glu Thr Asn Gly Pro Asp Ala Asn His Glu Tyr Ser Thr Ser
660 665 670

Ser Gln His Met Asp Tyr Lys Asp Thr Phe Asp Met Leu Asp Ser Leu
675 680 685

Leu Ser Ala Gln Gly Met Asn Met
690 695



(2) INFORMATION FOR SEQ ID NO: 46: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2190 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 46: 

ATGGCGTCGG GCCGTGGAGC TTCTTCTCGC TGGTTCTTTA CTCGGGAACA GCTGGAGAAC 60 

ACGCCGAGCC GCCGCTGCGG AGTGGAGGCG GATAAAGAGC TCTCGTGCCG CCAGCAGGCG 120 

GCCAACCTCA TCCAGGAGAT GGGACAGCGT CTCAATGTCT CTCAGCTTAC AATAAACACT 180 

GCGATTGTTT ATATGCACAG GTTTTATATG CACCATTCTT TCACCAAATT CAACAAAAAT 240

ATAATATCGT CTACTGCATT ATTTTTGGCT GCAAAAGTGG AAGAACAGGC TCGAAAACTT 300 

GAACATGTTA TCAAAGTAGC ACATGCTTGT CTTCATCCTC TAGAGCCACT GCTGGATACT 360 

AAATGTGATG CTTACCTTCA ACAGACTCAA GAACTGGTTA TACTTGAAAC CATAATGCTA 420 

CAAACTCTAG GTTTTGAGAT CACCATTGAA CACCCACACA CAGATGTGGT GAAATGTACC 480

CAGTTAGTAA GAGCAAGCAA GGATTTGGCA CAGACATCCT ATTTCATGGC TACCAACAGT 540

CTGCATCTTA CAACCTTCTG TCTTCAGTAC AAACCAACAG TGATAGCATG TGTATGCATT 600 

CATTTGGCTT GCAAATGGTC CAATTGGGAG ATCCCTGTAT CAACTGATGG AAAGCATTGG 660 

TGGGAATATG TGGATCCTAC AGTTACTCTA GAATTATTAG ATGAGCTAAC ACATGAGTTT 720 

CTACAAATAT TGGAGAAAAC GCCTAATAGG TTGAAGAAGA TTCGAAACTG GAGGGCTAAT 780 

CAGGCAGCTA GGAAACCAAA AGTAGATGGA CAGGTATCAG AGACACCACT TCTTGGTTCA 840

TCTTTGGTCC AGAATTCCAT TTTAGTAGAT AGTGTCACTG GTGTGCCTAC AAACCCAAGT 900 

TTTCAGAAAC CATCTACATC AGCATTCCCT GCGCCAGTAC CTCTAAATTC AGGAAATATT 960 

TCTGTTCAAG ACAGCCATAC ATCTGATAAT TTGTCAATGC TAGCAACAGG AATGCCAAGT 1020 

ACTTCATACG GTTTATCATC ACACCAGGAA TGGCCTCAAC ATCAAGACTC AGCAAGGACA 1080 



GAACAGCTAT ATTCACAGAA ACAGGAGACA TCTTTGTCTG GTAGCCAGTA CAACATCAAC 1140 

TTCCAGCAGG GACCTTCTAT ATCACTGCAT TCAGGATTAC ATCACAGACC TGACAAAATT 1200 

TCAGATCATT CTTCTGTTAA GCAGGAATAT ACTCATAAAG CAGGGAGCAG TAAACACCAT 1260

GGGCCAATTT CCACTACTCC AGGAATAATT CCTCAGAAAA TGTCTTTAGA TAAATATAGA 1320 

GAAAAGCGTA AACTAGAAAC TCTTGATCTC GATGTAAGGG ATCATTATAT AGCTGCCCAG 1380 

GTAGAACAGC AGCACAAACA AGGGCAGTCA CAGGCAGCCA GCAGCAGTTC TGTTACTTCT 1440

CCCATTAAAA TGAAAATACC TATCGCAAAT ACTGAAAAAT ACATGGCAGA TAAAAAGGAA 1500 

AAGAGTGGGT CACTGAAATT ACGGATTCCA ATACCACCCA CTGATAAAAG CGCCAGTAAA 1560

GAAGAACTGA AAATGAAAAT AAAAGTTTCT TCTTCAGAAA GACACAGCTC TTCTGATGAA 1620 

GGCAGTGGGA AAAGCAAACA TTCAAGCCCA CATATTAGCA GAGACCATAA GGAGAAGCAC 1680 

AAGGAGCATC CTTCAAGCCG CCACCACACC AGCAGCCACA AGCATTCCCA CTCGCATAGT 1740

GGCAGCAGCA GCGGTGGCAG TAAACACAGT GCCGACGGAA TACCACCCAC TGTTCTGAGG 1800 

AGTCCTGTTG GCCTGAGCAG TGATGGCATT TCCTCTAGCT CCAGCTCTTC AAGGAAGAGG 1860 

CTGCATGTCA ATGATGCATC TCACAACCAC CACTCCAAAA TGAGCAAAAG TTCCAAAAGT 1920 

TCAGGTAGTT CATCTAGTTC TTCCTCCTCT GTTAAGCAGT ATATATCCTC TCACAACTCT 1980 

GTTTTTAACC ATCCCTTACC CCTCCTCCCC TGTCACATAC CAGGTGGGCT ACGGACATCT 2040 

CTGCACCTCG TGAAACTGGA CAAGAAGCCA GTGGAGACCA ACGGTCCTGA TGCCAATCAC 2100 

GAGTACAGTA CAAGCAGCCA GCATATGGAC TACAAAGACA CATTCGACAT GCTGGACTCA 2160 

CTGTTAAGTG CCCAAGGAAT GAACATGTAA 2190 

(2) INFORMATION FOR SEQ ID NO: 47: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 729 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47: 

Met Ala Ser Gly Arg Gly Ala Ser Ser Arg Trp Phe Phe Thr Arg Glu
1 5 10 15

Gln Leu Glu Asn Thr Pro Ser Arg Arg Cys Gly Val Glu Ala Asp Lys
20 25 30

Glu Leu Ser Cys Arg Gln Gln Ala Ala Asn Leu Ile Gln Glu Met Gly
35 40 45

Gln Arg Leu Asn Val Ser Gln Leu Thr Ile Asn Thr Ala Ile Val Tyr
50 55 60

Met His Arg Phe Tyr Met His His Ser Phe Thr Lys Phe Asn Lys Asn
65 70 75 80

Ile Ile Ser Ser Thr Ala Leu Phe Leu Ala Ala Lys Val Glu Glu Gln
85 90 95

Ala Arg Lys Leu Glu His Val Ile Lys Val Ala His Ala Cys Leu His
100 105 110

Pro Leu Glu Pro Leu Leu Asp Thr Lys Cys Asp Ala Tyr Leu Gln Gln
115 120 125

Thr Gln Glu Leu Val Ile Leu Glu Thr Ile Met Leu Gln Thr Leu Gly
130 135 140

Phe Glu Ile Thr Ile Glu His Pro His Thr Asp Val Val Lys Cys Thr
145 150 155 160

Gln Leu Val Arg Ala Ser Lys Asp Leu Ala Gln Thr Ser Tyr Phe Met
165 170 175

Ala Thr Asn Ser Leu His Leu Thr Thr Phe Cys Leu Gln Tyr Lys Pro
180 185 190

Thr Val Ile Ala Cys Val Cys Ile His Leu Ala Cys Lys Trp Ser Asn
195 200 205

Trp Glu Ile Pro Val Ser Thr Asp Gly Lys His Trp Trp Glu Tyr Val
210 215 220

Asp Pro Thr Val Thr Leu Glu Leu Leu Asp Glu Leu Thr His Glu Phe
225 230 235 240

Leu Gln Ile Leu Glu Lys Thr Pro Asn Arg Leu Lys Lys Ile Arg Asn
245 250 255

Trp Arg Ala Asn Gln Ala Ala Arg Lys Pro Lys Val Asp Gly Gln Val
260 265 270

Ser Glu Thr Pro Leu Leu Gly Ser Ser Leu Val Gln Asn Ser Ile Leu
275 280 285

Val Asp Ser Val Thr Gly Val Pro Thr Asn Pro Ser Phe Gln Lys Pro
290 295 300

Ser Thr Ser Ala Phe Pro Ala Pro Val Pro Leu Asn Ser Gly Asn Ile
305 310 315 320



Ser Val Gln Asp Ser His Thr Ser Asp Asn Leu Ser Met Leu Ala Thr
325 330 335

Gly Met Pro Ser Thr Ser Tyr Gly Leu Ser Ser His Gln Glu Trp Pro
340 345 350

Gln His Gln Asp Ser Ala Arg Thr Glu Gln Leu Tyr Ser Gln Lys Gln
355 360 365

Glu Thr Ser Leu Ser Gly Ser Gln Tyr Asn Ile Asn Phe Gln Gln Gly
370 375 380

Pro Ser Ile Ser Leu His Ser Gly Leu His His Arg Pro Asp Lys Ile
385 390 395 400

Ser Asp His Ser Ser Val Lys Gln Glu Tyr Thr His Lys Ala Gly Ser
405 410 415

Ser Lys His His Gly Pro Ile Ser Thr Thr Pro Gly Ile Ile Pro Gln
420 425 430

Lys Met Ser Leu Asp Lys Tyr Arg Glu Lys Arg Lys Leu Glu Thr Leu
435 440 445

Asp Leu Asp Val Arg Asp His Tyr Ile Ala Ala Gln Val Glu Gln Gln
450 455 460

His Lys Gln Gly Gln Ser Gln Ala Ala Ser Ser Ser Ser Val Thr Ser
465 470 475 480

Pro Ile Lys Met Lys Ile Pro Ile Ala Asn Thr Glu Lys Tyr Met Ala
485 490 495

Asp Lys Lys Glu Lys Ser Gly Ser Leu Lys Leu Arg Ile Pro Ile Pro
500 505 510

Pro Thr Asp Lys Ser Ala Ser Lys Glu Glu Leu Lys Met Lys Ile Lys
515 520 525

Val Ser Ser Ser Glu Arg His Ser Ser Ser Asp Glu Gly Ser Gly Lys
530 535 540

Ser Lys His Ser Ser Pro His Ile Ser Arg Asp His Lys Glu Lys His
545 550 555 560

Lys Glu His Pro Ser Ser Arg His His Thr Ser Ser His Lys His Ser
565 570 575

His Ser His Ser Gly Ser Ser Ser Gly Gly Ser Lys His Ser Ala Asp
580 585 590

Gly Ile Pro Pro Thr Val Leu Arg Ser Pro Val Gly Leu Ser Ser Asp
595 600 605

Gly Ile Ser Ser Ser Ser Ser Ser Ser Arg Lys Arg Leu His Val Asn
610 615 620



Asp Ala Ser His Asn His His Ser Lys Met Ser Lys Ser Ser Lys Ser
625 630 635 640

Ser Gly Ser Ser Ser Ser Ser Ser Ser Ser Val Lys Gln Tyr Ile Ser
645 650 655

Ser His Asn Ser Val Phe Asn His Pro Leu Pro Leu Leu Pro Cys His
660 665 670

Ile Pro Gly Gly Leu Arg Thr Ser Gln His Leu Val Lys Leu Asp Lys
675 680 685

Lys Pro Val Glu Thr Asn Gly Pro Asp Ala Asn His Glu Tyr Ser Thr
690 695 700

Ser Ser Gln His Met Asp Tyr Lys Asp Thr Phe Asp Met Leu Asp Ser
705 710 715 720

Leu Leu Ser Ala Gln Gly Met Asn Met
725



(2) INFORMATION FOR SEQ ID NO: 48: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2360 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 48: 

GGAAGTGCCT GCAACCTTCG CCGCTGCCTT CTGGTTGAAG CACTATGGAG GGAGAGAGGA 60

AGAACAACAA CAAACGGTGG TATTTCACTC GAGAACAGCT GGAAAATAGC CCATCCCGTC 120 

GTTTTGGCGT GGACCCAGAT AAAGAACTTT CTTATCGCCA GCAGGCGGCC AATCTGCTTC 180 

AGGACATGGG GCAGCGTCTT AACGTCTCAC AATTGACTAT CAACACTGCT ATAGTATACA 240

TGCATCGATT CTACATGATT CAGTCCTTCA CACGGTTCCC TGGAAATTCT GTGGCTCCAG 300 

CAGCCTTGTT TCTAGCAGCT AAAGTGGAGG AGCAGCCCAA AAAATTGGAA CATGTCATCA 360 

AGGTAGCACA TACTTGTCTC CATCCTCAGG AATCCCTTCC TGATACTAGA AGTGAGGCTT 420

ATTTGCAACA AGTTCAAGAT CTGGTCATTT TAGAAAGCAT AATTTTGCAG ACTTTAGGCT 480 

TTGAACTAAC AATTGATCAC CCACATACTC ATGTAGTAAA GTGCACTCAA CTTGTTCGAG 540

CAAGCAAGGA CTTAGCACAG ACTTCTTACT TCATGGCAAC CAACAGCCTG CATTTGACCA 600 

CATTTAGCCT GCAGTACACA CCTCCTGTGG TGGCCTGTGT CTGCATTCAC CTGGCTTGCA 660 

AGTGGTCCAA TTGGGAGATC CCAGTCTCAA CTGACGGGAA GCACTGGTGG GAGTATGTTG 720 

ACGCCACTGT GACCTTGGAA CTTTTAGATG AACTGACACA TGAGTTTCTA CAGATTTTGG 780 

AGAAAACTCC CAACAGGCTC AAACGCATTT GGAATTGGAG GGCATGCGAG GCTGCCAAGA 840 



AAACAAAAGC AGATGACCGA GGAACAGATG AAAAGACTTC AGAGCAGACA ATCCTCAATA 900 

TGATTTCCCA GAGCTCTTCA GACACAACCA TTGCAGGTTT AATGAGCATG TCAACTTCTA 960 

CCACAAGTGC AGTGCCTTCC CTGCCAGTCT CCGAAGAGTC ATCCAGCAAC TTAACCAGTG 1020 

TGGAGATGTT GCCGGGCAAG CGTTGGCTGT CCTCCCAACC TTCTTTCAAA CTAGAACCTA 1080 

CTCAGGGTCA TCGGACTAGT GAGAATTTAG CACTTACAGG AGTTGATCAT TCCTTACCAC 1140 

AGGATGGTTC AAATGCATTT ATTTCCCAGA AGCAGAATAG TAAGAGTGTG CCATCAGCTA 1200 

AAGTGTCACT GAAAGAATAC CGCGCGAAGC ATGCAGAAGA ATTGGCTGCC CAGAAGAGGC 1260 

AACTGGAGAA CATGGAAGCC AATGTGAAGT CACAATATGC ATATGCTGCC CAGAATCTCC 1320 

TTTCTCATCA TGATAGCCAT TCTTCAGTCA TTCTAAAAAT GCCCATAGAG GGTTCAGAAA 1380 

ACCCCGAGCG GCCTTTTCTG GAAAAGGCTG ACAAAACAGC TCTCAAAATG AGAATCCCAG 1440

TGGCAGGTGG AGATAAAGCT GCGTCTTCAA AACCAGAGGA GATAAAAATG CGCATAAAAG 1500 

TCCATGCTGC AGCTGATAAG CACAATTCTG TAGAGGACAG TGTTACAAAG AGCCGAGAGC 1560 

ACAAAGAAGA GCGCAAGACT CACCCATCTA ATCATCATCA TCATCATAAT CACCACTCAC 1620

ACAAGCACTC TCATTCCCAA CTTCCAGTTG GTACTGGGAA CAAACGTCCT GGTGATCCAA 1680 

AACATAGTAG CCAGACAAGC AACTTAGCAC ATAAAACCTA TAGCTTGTCT AGTTCTTTTT 1740

CCTCTTCCAG TTCTACTCGT AAAAGGGGAC CCTCTGAAGA GACTGGAGGG GCTGTGTTTG 1800 

ATCATCCAGC CAAGATTGCC AAGAGTACTA AATCCTCTTC CCTAAATTTC TCCTTCCCTT 1860

CACTTCCTAC AATGGGTCAG ATGCCTGGGC ATAGCTCAGA CACAAGTGGC CTTTCCTTTT 1920 

CACAGCCCAG CTGTAAAACT CGTGTCCCTC ATTCGAAACT GGATAAAGGG CCCACTGGGG 1980 

CCAATGGTCA CAACACGACC CAGACAATAG ACTATCAAGA CACTGTGAAT ATGCTTCACT 2040 

CCCTGCTCAG TGCCCAGGGT GTTCAGCCCA CTCAGCCCAC TGCATTTGAA TTTGTTCGTC 2100 

CTTATAGTGA CTATCTGAAT CCTCGGTCTG GTGGAATCTC CTCGAGATCT GGCAATACAG 2160 

ACAAACCCCG GCCACCACCT CTGCCATCAG AACCTCCTCC ACCACTTCCA CCCCTTCCTA 2220 

AGTAAAAAAA GAAAAAGAAG AGGAGAAAAA AACTTCTTTA AAAAAACACA TAATTTTTCT 2280 

TTTTTTTTTG GGGAAAAAAA AATTTTTTTT AAAATTTTTT CCCCAAGGGA CGGGGGAAAA 2340

TTTTATTTTT AAAATTTTTT 2360 



(2) INFORMATION FOR SEQ ID NO: 49:

(i) SEQUENCE CHARACTERISTICS:



(A) LENGTH: 2181 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49: 

ATGGAGGGAG AGAGGAAGAA CAACAACAAA CGGTGGTATT TCACTCGAGA ACAGCTGGAA 60

AATAGCCCAT CCCGTCGTTT TGGCGTGGAC CCAGATAAAG AACTTTCTTA TCGCCAGCAG 120 

GCGGCCAATC TGCTTCAGGA CATGGGGCAG CGTCTTAACG TCTCACAATT GACTATCAAC 180 

ACTGCTATAG TATACATGCA TCGATTCTAC ATGATTCAGT CCTTCACACG GTTCCCTGGA 240 

AATTCTGTGG CTCCAGCAGC CTTGTTTCTA GCAGCTAAAG TGGAGGAGCA GCCCAAAAAA 300 

TTGGAACATG TCATCAAGGT AGCACATACT TGTCTCCATC CTCAGGAATC CCTTCCTGAT 360

ACTAGAAGTG AGGCTTATTT GCAACAAGTT CAAGATCTGG TCATTTTAGA AAGCATAATT 420

TTGCAGACTT TAGGCTTTGA ACTAACAATT GATCACCCAC ATACTCATGT AGTAAAGTGC 480

ACTCAACTTG TTCGAGCAAG CAAGGACTTA GCACAGACTT CTTACTTCAT GGCAACCAAC 540 

AGCCTGCATT TGACCACATT TAGCCTGCAG TACACACCTC CTGTGGTGGC CTGTGTCTGC 600 

ATTCACCTGG CTTGCAAGTG GTCCAATTGG GAGATCCCAG TCTCAACTGA CGGGAAGCAC 660 

TGGTGGGAGT ATGTTGACGC CACTGTGACC TTGGAACTTT TAGATGAACT GACACATGAG 720

TTTCTACAGA TTTTGGAGAA AACTCCCAAC AGGCTCAAAC GCATTTGGAA TTGGAGGGCA 780 

TGCGAGGCTG CCAAGAAAAC AAAAGCAGAT GACCGAGGAA CAGATGAAAA GACTTCAGAG 840 

CAGACAATCC TCAATATGAT TTCCCAGAGC TCTTCAGACA CAACCATTGC AGGTTTAATG 900 

AGCATGTCAA CTTCTACCAC AAGTGCAGTG CCTTCCCTGC CAGTCTCCGA AGAGTCATCC 960 

AGCAACTTAA CCAGTGTGGA GATGTTGCCG GGCAAGCGTT GGCTGTCCTC CCAACCTTCT 1020 

TTCAAACTAG AACCTACTCA GGGTCATCGG ACTAGTGAGA ATTTAGCACT TACAGGAGTT 1080

GATCATTCCT TACCACAGGA TGGTTCAAAT GCATTTATTT CCCAGAAGCA GAATAGTAAG 1140 

AGTGTGCCAT CAGCTAAAGT GTCACTGAAA GAATACCGCG CGAAGCATGC AGAAGAATTG 1200 

GCTGCCCAGA AGAGGCAACT GGAGAACATG GAAGCCAATG TGAAGTCACA ATATGCATAT 1260 

GCTGCCCAGA ATCTCCTTTC TCATCATGAT AGCCATTCTT CAGTCATTCT AAAAATGCCC 1320 

ATAGAGGGTT CAGAAAACCC CGAGCGGCCT TTTCTGGAAA AGGCTGACAA AACAGCTCTC 1380 

AAAATGAGAA TCCCAGTGGC AGGTGGAGAT AAAGCTGCGT CTTCAAAACC AGAGGAGATA 1440

AAAATGCGCA TAAAAGTCCA TGCTGCAGCT GATAAGCACA ATTCTGTAGA GGACAGTGTT 1500 



ACAAAGAGCC GAGAGCACAA AGAAGAGCGC AAGACTCACC CATCTAATCA TCATCATCAT 1560 

CATAATCACC ACTCACACAA GCACTCTCAT TCCCAACTTC CAGTTGGTAC TGGGAACAAA 1620 

CGTCCTGGTG ATCCAAAACA TAGTAGCCAG ACAAGCAACT TAGCACATAA AACCTATAGC 1680

TTGTCTAGTT CTTTTTCCTC TTCCAGTTCT ACTCGTAAAA GGGGACCCTC TGAAGAGACT 1740

GGAGGGGCTG TGTTTGATCA TCCAGCCAAG ATTGCCAAGA GTACTAAATC CTCTTCCCTA 1800 

AATTTCTCCT TCCCTTCACT TCCTACAATG GGTCAGATGC CTGGGCATAG CTCAGACACA 1860 

AGTGGCCTTT CCTTTTCACA GCCCAGCTGT AAAACTCGTG TCCCTCATTC GAAACTGGAT 1920 

AAAGGGCCCA CTGGGGCCAA TGGTCACAAC ACGACCCAGA CAATAGACTA TCAAGACACT 1980 

GTGAATATGC TTCACTCCCT GCTCAGTGCC CAGGGTGTTC AGCCCACTCA GCCCACTGCA 2040 

TTTGAATTTG TTCGTCCTTA TAGTGACTAT CTGAATCCTC GGTCTGGTGG AATCTCCTCG 2100

AGATCTGGCA ATACAGACAA ACCCCGGCCA CCACCTCTGC CATCAGAACC TCCTCCACCA 2160 

CTTCCACCCC TTCCTAAGTA A 2181 

(2) INFORMATION FOR SEQ ID NO: 50: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 726 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50: 

Met Glu Gly Glu Arg Lys Asn Asn Asn Lys Arg Trp Tyr Phe Thr Arg
1 5 10 15

Glu Gln Leu Glu Asn Ser Pro Ser Arg Arg Phe Gly Val Asp Pro Asp
20 25 30

Lys Glu Leu Ser Tyr Arg Gln Gln Ala Ala Asn Leu Leu Gln Asp Met
35 40 45

Gly Gln Arg Leu Asn Val Ser Gln Leu Thr Ile Asn Thr Ala Ile Val
50 55 60

Tyr Met His Arg Phe Tyr Met Ile Gln Ser Phe Thr Arg Phe Pro Gly
65 70 75 80

Asn Ser Val Ala Pro Ala Ala Leu Phe Leu Ala Ala Lys Val Glu Glu
85 90 95

Gln Pro Lys Lys Leu Glu His Val Ile Lys Val Ala His Thr Cys Leu
100 105 110

His Pro Gln Glu Ser Leu Pro Asp Thr Arg Ser Glu Ala Tyr Leu Gln
115 120 125

Gln Val Gln Asp Leu Val Ile Leu Glu Ser Ile Ile Leu Gln Thr Leu
130 135 140

Gly Phe Glu Leu Thr Ile Asp His Pro His Thr His Val Val Lys Cys
145 150 155 160

Thr Gln Leu Val Arg Ala Ser Lys Asp Leu Ala Gln Thr Ser Tyr Phe
165 170 175

Met Ala Thr Asn Ser Leu His Leu Thr Thr Phe Ser Leu Gln Tyr Thr
180 185 190

Pro Pro Val Val Ala Cys Val Cys Ile His Leu Ala Cys Lys Trp Ser
195 200 205

Asn Trp Glu Ile Pro Val Ser Thr Asp Gly Lys His Trp Trp Glu Tyr
210 215 220

Val Asp Ala Thr Val Thr Leu Glu Leu Leu Asp Glu Leu Thr His Glu
225 230 235 240

Phe Leu Gln Ile Leu Glu Lys Thr Pro Asn Arg Leu Lys Arg Ile Trp
245 250 255

Asn Trp Arg Ala Cys Glu Ala Ala Lys Lys Thr Lys Ala Asp Asp Arg
260 265 270

Gly Thr Asp Glu Lys Thr Ser Glu Gln Thr Ile Leu Asn Met Ile Ser
275 280 285

Gln Ser Ser Ser Asp Thr Thr Ile Ala Gly Leu Met Ser Met Ser Thr
290 295 300

Ser Thr Thr Ser Ala Val Pro Ser Leu Pro Val Ser Glu Glu Ser Ser
305 310 315 320

Ser Asn Leu Thr Ser Val Glu Met Leu Pro Gly Lys Arg Trp Leu Ser
325 330 335

Ser Gln Pro Ser Phe Lys Leu Glu Pro Thr Gln Gly His Arg Thr Ser
340 345 350

Glu Asn Leu Ala Leu Thr Gly Val Asp His Ser Leu Pro Gln Asp Gly
355 360 365

Ser Asn Ala Phe Ile Ser Gln Lys Gln Asn Ser Lys Ser Val Pro Ser
370 375 380

Ala Lys Val Ser Leu Lys Glu Tyr Arg Ala Lys His Ala Glu Glu Leu
385 390 395 400



Ala Ala Gin Lys Arg Gin Leu Glu Asn Met Glu Ala Asn Val Lys Ser 
405 410 415 



Gin Tyr Ala Tyr Ala Ala Gin Asn Leu Leu Ser His His Asp Ser His 



420 



425 



430 



Ser Ser Val lie Leu Lys Met Pro lie Glu Gly Ser Glu Asn Pro Glu 
435 440 445 

Arg Pro Phe Leu Glu Lys Ala Asp Lys Thr Ala Leu Lys Met Arg lie 
450 455 460 

Pro Val Ala Gly Gly Asp Lys Ala Ala Ser Ser Lys Pro Glu Glu lie 
465 470 475 480 

Lys Met Arg lie Lys Val His Ala Ala Ala Asp Lys His Asn Ser Val 
485 490 495 

Glu Asp Ser Val Thr Lys Ser Arg Glu His Lys Glu Glu Arg Lys Thr 
500 505 510 

His Pro Ser Asn His His His His His Asn His His Ser His Lys His 
515 520 525 

Ser His Ser Gin Leu Pro Val Gly Thr Gly Asn Lys Arg Pro Gly Asp 
530 535 540 

Pro Lys His Ser Ser Gin Thr Ser Asn Leu Ala His Lys Thr Tyr Ser 
545 550 555 560 

Leu Ser Ser Ser Phe Ser Ser Ser Ser Ser Thr Arg Lys Arg Gly Pro 
565 570 575 

Ser Glu Glu Thr Gly Gly Ala Val Phe Asp His Pro Ala Lys lie Ala 
580 585 590 

Lys Ser Thr Lys Ser Ser Ser Leu Asn Phe Ser Phe Pro Ser Leu Pro 
595 600 605 

Thr Met Gly Gin Met Pro Gly His Ser Ser Asp Thr Ser Gly Leu Ser 
610 615 620 

Phe Ser Gin Pro Ser Cys Lys Thr Arg Val Pro His Ser Lys Leu Asp 
625 630 635 640 

Lys Gly Pro Thr Gly Ala Asn Gly His Asn Thr Thr Gin Thr lie Asp 
645 650 655 

Tyr Gln Asp Thr Val Asn Met Leu His Ser Leu Leu Ser Ala Gln Gly 
660 665 670 

Val Gln Pro Thr Gln Pro Thr Ala Phe Glu Phe Val Arg Pro Tyr Ser 
675 680 685 

Asp Tyr Leu Asn Pro Arg Ser Gly Gly Ile Ser Ser Arg Ser Gly Asn 
690 695 700 



Thr Asp Lys Pro Arg Pro Pro Pro Leu Pro Ser Glu Pro Pro Pro Pro 
705 710 715 720 



Leu Pro Pro Leu Pro Lys 



(2) INFORMATION FOR SEQ ID NO: 51: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51: 
TTCCCACCAA TGCTTTCC 



(2) INFORMATION FOR SEQ ID NO: 52: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 22 base pairs 
(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 52: 
CCATCAGTTG ATACAGGGAT CT 



(2) INFORMATION FOR SEQ ID NO: 53: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 53: 
GGAATTCAGA AGGTTGTAAG ATGC 



(2) INFORMATION FOR SEQ ID NO: 54: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54: 
ACACACAGAT GTGGTGAAAT GTACCCA 



(2) INFORMATION FOR SEQ ID NO: 55: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 55: 
GCATCTTACA ACCTTCTG 



(2) INFORMATION FOR SEQ ID NO: 56: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 28 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 56: 
GGAATTCATG GAAAGCATTG GTGGGAAT 



(2) INFORMATION FOR SEQ ID NO: 57: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 57: 
CCTCCACTAC TGGTTTGCCT GG 



(2) INFORMATION FOR SEQ ID NO: 58: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 58: 
GGACTAGTAT AAATATGGCG TCGGGCCGTG 



(2) INFORMATION FOR SEQ ID NO: 59: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 59: 
GGAGATCTTA CATGTTCATT CCTTGGG 



(2) INFORMATION FOR SEQ ID NO: 60: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 29 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 60: 
GGAGACAAGT ATGTGCTACC TTGATGACA 



(2) INFORMATION FOR SEQ ID NO: 61: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 61: 
GGAATTCGGG CTGCTCCTCC ACTTTAG 



(2) INFORMATION FOR SEQ ID NO: 62: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 62: 
GGAATTCGCT GCTGGAGCCA CAGAA 



(2) INFORMATION FOR SEQ ID NO: 63: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 63: 
GTGTCACTGA AAGAATACCG 



(2) INFORMATION FOR SEQ ID NO: 64: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 64: 
GGAATTCAGG TGGAGATAAA GCTGC 



(2) INFORMATION FOR SEQ ID NO: 65: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 65: 
GCTCTAGATA AATATGGAGG GAGAGAGGAA 



(2) INFORMATION FOR SEQ ID NO: 66: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 66: 
GGAATTCTTA CTTAGGAAGG GGTGGAAGTG 



(2) INFORMATION FOR SEQ ID NO: 67: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 44 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 67: 
GGAATTCTTA CTTAGGAAGG GGTGGAAGTG GTGGAGGAGG TTAC 



(2) INFORMATION FOR SEQ ID NO: 68: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 26 amino acids 



(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 68: 

Ala Cys Ser Tyr Ser Pro Thr Ser Pro Ser Tyr Ser Pro Thr Ser Pro 
1 5 10 15 



Ser Tyr Ser Pro Thr Ser Pro Ser Lys Lys 
20 25 



